Skip to content

[MCP] aggregate_records DML tool#3199

Merged
souvikghosh04 merged 51 commits intomainfrom
Usr/sogh/aggregate_records
Mar 11, 2026
Merged

[MCP] aggregate_records DML tool#3199
souvikghosh04 merged 51 commits intomainfrom
Usr/sogh/aggregate_records

Conversation

@souvikghosh04
Copy link
Contributor

@souvikghosh04 souvikghosh04 commented Mar 6, 2026

Why make this change?

What is this change?

  • New aggregate_records DML tool (AggregateRecordsTool.cs) that generates SQL-level aggregation queries (COUNT, AVG, SUM, MIN, MAX) with support for DISTINCT, OData $filter (WHERE), GROUP BY, HAVING operators (eq, neq, gt, gte, lt, lte, in), ORDER BY (asc/desc), and cursor-based pagination (first/after) — all per the spec in [Enh]: add aggregate_records DML tool to MCP server #3178.
    query-timeout configuration for MCP runtime options — allows setting a per-query timeout (1–600 seconds, default 30s) via McpRuntimeOptions.QueryTimeout, validated at startup.
  • Config & CLI plumbing: DmlToolsConfig gains AggregateRecords/UserProvidedAggregateRecords; McpRuntimeOptionsConverterFactory handles query-timeout serialization; ConfigureOptions.cs updated for CLI configure support; JSON schema updated.
  • Telemetry: Aggregation operations emit OpenTelemetry traces with structured error codes.
  • Updated 37 CLI + 4 Service.Tests snapshot files to reflect new default properties.

How was this tested?

  • Integration Tests — AggregateRecordsToolTests.cs (all 13 spec examples + edge cases), McpQueryTimeoutTests.cs, EntityLevelDmlToolConfigurationTests.cs, McpToolRegistryTests.cs
  • Unit Tests — AggregateRecordsToolTests.cs (unit), McpTelemetryTests.cs, RequestParserUnitTests.cs, SqlQueryExecutorUnitTests.cs

Copilot AI and others added 30 commits February 28, 2026 05:55
Co-authored-by: JerryNixon <1749983+JerryNixon@users.noreply.github.com>
Co-authored-by: JerryNixon <1749983+JerryNixon@users.noreply.github.com>
Co-authored-by: JerryNixon <1749983+JerryNixon@users.noreply.github.com>
Co-authored-by: JerryNixon <1749983+JerryNixon@users.noreply.github.com>
Co-authored-by: JerryNixon <210500244+JerryNixon@users.noreply.github.com>
…eout to all MCP tools, add tests

Co-authored-by: JerryNixon <1749983+JerryNixon@users.noreply.github.com>
…test file

Co-authored-by: JerryNixon <1749983+JerryNixon@users.noreply.github.com>
Co-authored-by: anushakolan <45540936+anushakolan@users.noreply.github.com>
Co-authored-by: anushakolan <45540936+anushakolan@users.noreply.github.com>
…g performance by offloading computations to the database
Replace in-memory aggregation tests (PerformAggregation, ApplyPagination)
with SQL expression generation tests (BuildAggregateExpression,
BuildQuotedTableRef, DecodeCursorOffset). All 13 spec examples and 5
blog scenarios now validate SQL patterns instead of in-memory computation.

89 tests pass. Build and format clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- DecodeCursorOffset now rejects negative values (returns 0)
- Add max validation for 'first' parameter (100000 limit)
- Prevents integer overflow on first+1 and invalid SQL OFFSET
- Add tests for both edge cases

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace custom SQL string building with engine's SqlQueryStructure +
GroupByMetadata + queryBuilder.Build(structure) pattern. This uses the
same AggregationColumn, AggregationOperation, and Predicate types that
the engine's GraphQL aggregation path uses.

Removed methods: BuildAggregateSql, BuildAggregateExpression,
BuildQuotedTableRef, BuildWhereClause, BuildHavingClause,
AppendPagination. These are now handled by the engine's query builder.

Updated both test files to remove references to removed methods.
All 69 aggregate tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Fix COUNT(*): Use primary key column (PK NOT NULL, so COUNT(pk)
  COUNT(*)) instead of AggregationColumn with empty schema/table/'*'
  which produced invalid SQL like count([].[*])
- Fix TOP + OFFSET/FETCH conflict: Remove TOP N when pagination is
  used since SQL Server forbids both in the same query
- Add database type validation: Return error for PostgreSQL/MySQL/
  CosmosDB since engine only supports aggregation for MsSql/DWSQL
- Add HAVING validation: Reject having without groupby
- Add tests for star-field-with-avg, distinct-count-star, and
  having-without-groupby validation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add 8 tests covering all 5 scenarios from the DAB MCP blog post
(devblogs.microsoft.com/azure-sql/data-api-builder-mcp-questions):
1. Strategic customer importance (sum/groupby/orderby desc/first 1)
2. Product discontinuation (sum/groupby/orderby asc/first 1)
3. Quarterly performance (avg/groupby/having gt/orderby desc)
4. Revenue concentration (sum/complex filter/multi-groupby/having)
5. Risk exposure (sum/filter/multi-groupby/having gt)

Each test verifies the exact blog JSON payload passes input validation,
plus tests for schema completeness, describe_entities instruction,
and alias convention documentation.

80 tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Remove redundant parameter listings from Description (already in
InputSchema). Description now covers only: workflow steps, rules
not expressed elsewhere, and response alias convention.

Parameter descriptions simplified to one sentence each, removing
repeated phrases like 'from describe_entities' and 'ONLY applies
when groupby is provided' (stated once in groupby description).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Validate field and groupby field names immediately after metadata
resolution, before authorization or query building. Invalid field
names now return a FieldNotFound error with model-friendly guidance
to call describe_entities for valid field names.

- Add McpErrorHelpers.FieldNotFound() with entity name, field name,
  parameter name, and describe_entities guidance
- Move field existence checks before auth in AggregateRecordsTool
- Remove redundant late validation (already caught early)
- Add tests for FieldNotFound error type and message content

82 tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Rename abbreviated variable names to their full, readable forms:
funcElfunctionElement, fieldElfieldElement, distinctEldistinctElement,
filterElfilterElement, orderbyElorderbyElement, firstElfirstElement,
afterElafterElement, groupbyElgroupbyElement, ggroupbyItem,
gValgroupbyFieldName, gFieldgroupbyField, havingElhavingElement,
havingOpshavingOperators, havingInhavingInValues, aggTypeaggregationType,
aggColumnaggregationColumn, predOppredicateOperation, ophavingOperator,
predpredicate, backingColbackingColumn, backingGColbackingGroupbyColumn,
timeoutExtimeoutException, taskExtaskCanceledException,
dbExdbException, argExargumentException/dabException.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
DAB config already has MaxResponseSize property that handles this
downstream through structure.Limit(). The engine applies the configured
limit automatically, making this artificial cap redundant.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: souvikghosh04 <210500244+souvikghosh04@users.noreply.github.com>
@souvikghosh04
Copy link
Contributor Author

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 6 pipeline(s).

@Aniruddh25
Copy link
Collaborator

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 6 pipeline(s).

@souvikghosh04
Copy link
Contributor Author

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 6 pipeline(s).

@souvikghosh04
Copy link
Contributor Author

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 6 pipeline(s).

@souvikghosh04 souvikghosh04 moved this from In Progress to Review In Progress in Data API builder Mar 10, 2026
@Aniruddh25
Copy link
Collaborator

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 6 pipeline(s).

Copy link
Contributor

@anushakolan anushakolan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really good work on the PR Souvik, I have some questions or comments that needs to be addressed. Otherwise, rest looks good to me, I will approve after these concerns are addressed.

@souvikghosh04
Copy link
Contributor Author

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 6 pipeline(s).

@souvikghosh04
Copy link
Contributor Author

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 6 pipeline(s).

@souvikghosh04 souvikghosh04 enabled auto-merge (squash) March 11, 2026 15:27
Copy link
Contributor

@anushakolan anushakolan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@souvikghosh04 souvikghosh04 merged commit 883f4a1 into main Mar 11, 2026
11 checks passed
@souvikghosh04 souvikghosh04 deleted the Usr/sogh/aggregate_records branch March 11, 2026 18:57
@github-project-automation github-project-automation bot moved this from Review In Progress to Done in Data API builder Mar 11, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

[Enh]: add aggregate_records DML tool to MCP server

5 participants